
    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to this growth. The widespread proliferation of smartphones enables everyday users to easily consume and produce such content. As the complexity and diversity of multimedia data have grown, so has the need for more sophisticated retrieval models that address the information needs of users. Finding relevant multimedia content is central to many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to reflect it. However, there is little research into supporting such information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap through several contributions to multimedia retrieval, focusing on two scenarios: video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, comprising a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways: first, through results from user-centric evaluations which pit different user-system combinations against one another; second, through a system-centric evaluation based on a new dataset for temporal information needs in video and lifelog retrieval, with which we quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
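    The abstract does not spell out the fusion algorithms; as a rough, purely illustrative sketch of temporal result fusion, the following hypothetical code chains per-sub-query hits in query order within a maximum temporal gap and aggregates their scores per video. The `Hit` structure, the greedy chaining strategy, and the 30-second gap are assumptions, not the thesis's actual algorithm.

```python
# Hypothetical sketch of temporal result fusion: given ranked results for an
# ordered list of sub-queries, reward videos whose hits can be chained in
# query order within a maximum temporal gap.
from dataclasses import dataclass

@dataclass
class Hit:
    video: str      # video (or lifelog day) identifier
    start: float    # segment start time in seconds
    score: float    # similarity score of the segment for one sub-query

def fuse_temporal(results: list[list[Hit]], max_gap: float = 30.0) -> dict[str, float]:
    """Fuse per-sub-query results into one score per video.

    results[i] holds the hits for the i-th sub-query; a video is scored
    when its hits can be chained in query order within `max_gap` seconds.
    """
    fused: dict[str, float] = {}
    for first in results[0]:
        best = first.score
        last_t = first.start
        ok = True
        for later in results[1:]:
            # pick the best-scoring hit in the same video that follows in time
            nxt = [h for h in later
                   if h.video == first.video and 0 < h.start - last_t <= max_gap]
            if not nxt:
                ok = False
                break
            top = max(nxt, key=lambda h: h.score)
            best += top.score
            last_t = top.start
        if ok:
            fused[first.video] = max(fused.get(first.video, 0.0), best)
    return fused
```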

    The PS-Battles Dataset - an Image Collection for Image Manipulation Detection

    The surge in available digital media has led to a significant increase in derivative work. With tools for manipulating objects becoming increasingly mature, it can be very difficult to determine whether one piece of media was derived from another or tampered with. As derivations can be made with malicious intent, there is an urgent need for reliable and easily usable tampering detection methods. However, even media considered semantically untampered by humans may have already undergone compression or light post-processing, making automated tampering detection susceptible to false positives. In this paper, we present the PS-Battles dataset, which is gathered from a large community of image manipulation enthusiasts and provides a basis for media derivation and manipulation detection in the visual domain. The dataset consists of 102'028 images grouped into 11'142 subsets, each containing the original image as well as a varying number of manipulated derivatives. The dataset introduced in this paper can be found at https://github.com/dbisUnibas/PS-Battle
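    Since each subset pairs one original with its derivatives, a loader can be organized around that grouping. The sketch below is hypothetical: the directory names `originals/` and `photoshops/<original_id>/` are assumptions about how the dataset might be laid out on disk, not the published specification.

```python
# Hypothetical loader for a PS-Battles-style layout. The directory structure
# assumed here (originals/ plus one photoshops/<id>/ folder per subset) is an
# illustration, not the dataset's documented format.
from pathlib import Path

def iter_manipulation_pairs(root: str):
    """Yield (original, derivative) image paths, one pair per manipulation."""
    root_path = Path(root)
    for original in (root_path / "originals").iterdir():
        subset = root_path / "photoshops" / original.stem
        if not subset.is_dir():
            continue
        for derivative in subset.iterdir():
            yield original, derivative

# Example: sanity-check a download by counting original/derivative pairs.
# pairs = list(iter_manipulation_pairs("PS-Battles"))
# print(len(pairs), "original/derivative pairs")
```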

    Competitive Interactive Video Retrieval in Virtual Reality with vitrivr-VR

    Virtual Reality (VR) has emerged and developed as a new modality to interact with multimedia data. In this paper, we present vitrivr-VR, a prototype of an interactive multimedia retrieval system in VR based on the open-source full-stack multimedia retrieval system vitrivr. We have implemented query formulation tailored to VR: using speech-to-text, users can search collections by text for concepts, OCR and ASR data, as well as for entire scene descriptions through a video-text co-embedding feature that embeds sentences and video sequences into the same feature space. Result presentation and relevance feedback in vitrivr-VR leverage the capabilities of virtual spaces.
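    To make the co-embedding idea concrete, the following minimal sketch ranks video segments by cosine similarity between a sentence embedding and precomputed segment embeddings in the shared space. The `embed_text` callable and `segment_vectors` array are hypothetical stand-ins, not vitrivr's actual feature pipeline.

```python
# Minimal sketch of retrieval via a text-video co-embedding, assuming a model
# that maps both sentences and video segments into the same vector space.
import numpy as np

def rank_segments(query: str,
                  segment_ids: list[str],
                  segment_vectors: np.ndarray,
                  embed_text) -> list[tuple[str, float]]:
    """Rank video segments by cosine similarity to the query sentence."""
    q = embed_text(query)                        # shape: (d,)
    q = q / np.linalg.norm(q)
    m = segment_vectors / np.linalg.norm(segment_vectors, axis=1, keepdims=True)
    sims = m @ q                                 # cosine similarity per segment
    order = np.argsort(-sims)
    return [(segment_ids[i], float(sims[i])) for i in order]
```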

    Deep Learning-based Concept Detection in vitrivr at the Video Browser Showdown 2019 - Final Notes

    This paper presents an after-the-fact summary of the vitrivr system's participation in the 2019 Video Browser Showdown. Analogously to last year's report, the focus of this paper lies on additions made since the original publication and on the system's performance during the competition.

    Towards Explainable Interactive Multi-Modal Video Retrieval with vitrivr

    This paper presents the most recent iteration of the vitrivr multimedia retrieval system for its participation in the Video Browser Showdown (VBS) 2021. Building on existing functionality for interactive multi-modal retrieval, we overhaul query formulation and result presentation for queries which specify temporal context, extend our database with index structures for similarity search, and present experimental functionality aimed at improving the explainability of results, with the objective of better supporting users in the selection of results and the provision of relevance feedback.
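    One simple way such explainability can be surfaced, sketched hypothetically below, is to break a fused result score into per-feature contributions (e.g. visual concepts vs. OCR vs. ASR) so users can see why an item was returned. The function, feature names, and weighting scheme are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: report how much each feature category contributes to
# the fused score of a result, largest contribution first.
def explain_score(per_feature: dict[str, float],
                  weights: dict[str, float]) -> list[tuple[str, float]]:
    """Return (feature, fractional contribution) pairs, sorted descending."""
    total = sum(weights[f] * s for f, s in per_feature.items()) or 1.0
    contribs = [(f, weights[f] * s / total) for f, s in per_feature.items()]
    return sorted(contribs, key=lambda kv: -kv[1])

# e.g. explain_score({"concepts": 0.8, "ocr": 0.2}, {"concepts": 1.0, "ocr": 0.5})
```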

    Exploring Intuitive Lifelog Retrieval and Interaction Modes in Virtual Reality with vitrivr-VR

    The multimodal nature of lifelog data collections poses unique challenges for multimedia management and retrieval systems. The Lifelog Search Challenge (LSC) offers an annual evaluation platform for such interactive retrieval systems, which compete against one another in finding items of interest within a set time frame. In this paper, we present the multimedia retrieval system vitrivr-VR, the latest addition to the vitrivr stack, which has participated in the LSC in recent years. vitrivr-VR leverages the 3D space in virtual reality (VR) to offer novel retrieval and user interaction models, which we describe with a special focus on design decisions taken for participation in the LSC.

    Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown

    The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks and results and give an overview of state-of-the-art methods used by the competing systems. By looking at query result logs provided by ten systems, we analyze differences in retrieval model performance and browsing times before a correct submission. Through advances in data gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks and discuss results, task design and methodological challenges. We highlight that almost all top-performing systems utilize some sort of joint embedding for text-image retrieval and enable the specification of temporal context in queries for known-item search. While a combination of these techniques drives the currently top-performing systems, we identify several future challenges for interactive video search engines and the Video Browser Showdown competition itself.

    A task category space for user-centric comparative multimedia search evaluations

    In the last decade, user-centric video search competitions have facilitated the evolution of interactive video search systems. So far, these competitions have focused on a small number of search task categories, with few attempts to change task category configurations. Based on our extensive experience with interactive video search contests, we have analyzed the spectrum of possible task categories and propose a list of individual axes that define a large space of possible task categories. Using this concept of a category space, new user-centric video search competitions can be designed to benchmark video search systems from different perspectives. We further analyze the three task categories considered so far at the Video Browser Showdown and discuss possible (but sometimes challenging) shifts within the task category space.
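    The "category space" notion can be pictured as the cross product of independent design axes, with each task category being one point in that product. The sketch below is purely illustrative; the axis names and values are invented for the example and are not the axes proposed in the paper.

```python
# Illustrative sketch: a task category as one point in the cross product of
# design axes. Axis names and values here are invented for illustration.
from itertools import product

AXES = {
    "target":     ["known item", "ad-hoc"],
    "query_unit": ["single segment", "temporal sequence"],
    "hint":       ["visual example", "textual description"],
}

categories = list(product(*AXES.values()))
print(len(categories), "possible task categories")   # 2 * 2 * 2 = 8
```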

    Cottontail DB: An Open Source Database System for Multimedia Retrieval and Analysis

    Multimedia retrieval and analysis are two important areas in "Big data" research. They have in common that they work with feature vectors as proxies for the media objects themselves. Together with metadata such as textual descriptions or numbers, these vectors describe a media object in its entirety and must therefore be considered jointly for both storage and retrieval. In this paper, we introduce Cottontail DB, an open-source database management system that integrates support for scalar and vector attributes in a unified data and query model that allows for both Boolean retrieval and nearest neighbour search. We demonstrate that Cottontail DB scales well to large collection sizes and vector dimensions, and we provide insights into how it proved to be a valuable tool in various use cases ranging from the analysis of MRI data to realizing retrieval solutions in the cultural heritage domain.
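    The query pattern such a unified model enables, a Boolean predicate over scalar attributes combined with nearest-neighbour search over a vector attribute, can be illustrated as follows. This NumPy sketch only demonstrates the semantics of filter-then-kNN under assumed inputs; it is not Cottontail DB's actual API.

```python
# Sketch of a combined Boolean + nearest-neighbour query: filter rows by a
# scalar predicate, then return the k nearest rows by L2 distance on their
# associated feature vectors.
import numpy as np

def boolean_knn(rows: list[dict], vectors: np.ndarray,
                predicate, query_vec: np.ndarray, k: int = 10) -> list[dict]:
    """Filter `rows` with `predicate`, then return the k nearest by L2 distance."""
    keep = [i for i, r in enumerate(rows) if predicate(r)]
    if not keep:
        return []
    dists = np.linalg.norm(vectors[keep] - query_vec, axis=1)
    order = np.argsort(dists)[:k]
    return [rows[keep[i]] for i in order]

# e.g. boolean_knn(rows, vecs, lambda r: r["type"] == "MRI", q, k=5)
```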